Bootstrapping the Grenander estimator

Michael R. Kosorok* 1 

University of North Carolina-Chapel Hill

Abstract: The goal of this paper is to study the bootstrap for the Grenander
estimator. The first result is a proof of the inconsistency of the nonparametric
bootstrap for the Grenander estimator at a given point. The second result is
the development and verification of a bootstrap for the $L_1$ confidence band for
the Grenander estimator. As part of this work, kernel estimators are studied
as alternatives to the Grenander estimator. We show that when the second
derivative of the true density is assumed to be uniformly bounded, there exist
kernel estimators with faster convergence rates than the Grenander estimator.
We study the implications of this in developing $L_1$ and uniform confidence
bands and discuss some open questions.



1. Introduction 

The Grenander estimator (Grenander [6]) is the maximum likelihood estimator
(MLE) $\hat f_n$ of the density $f$ of a positive, real random variable $X$, under the
constraint that $f$ is monotone non-increasing. For simplicity, we will assume the support
of $X$ is $[0, 1]$, and that the data used for estimation is an i.i.d. sample $X_1, \ldots, X_n$
from $f$.

Not only is the Grenander estimator worthy of study in its own right, but it 
is also useful because of its connection to the MLE of the survival function from 
current status time-to-event data. Current status data arises from only observing 
the current status of the event time T at a random observation time Y . Specifically, 
one only observes 1{T < Y} and Y, where 1{A} denotes the indicator of A. The 
connection between the estimators is that the current status MLE of the cumulative 
distribution at a single point and $\hat f_n(t)$ for a point $t$ both have the same limiting
distribution after suitable standardization. This limiting distribution is 

$$C = \operatorname*{argmax}_{h \in \mathbb{R}} \{Z(h) - h^2\},$$

where Z is two-sided Brownian motion with Z(0) = 0. This result was obtained for 
the Grenander estimator by Rao [16] (see also Groeneboom [7]) and for the current 
status estimator by Groeneboom [8] (see also Groeneboom and Wellner [10]). 
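
To get a rough feel for the limiting variable $C$, one can simulate it directly. The sketch below is not part of the paper: it approximates draws of $C$ by discretizing a two-sided Brownian motion on a finite grid and taking the grid point that maximizes $Z(h) - h^2$. The function name `simulate_chernoff`, the truncation to $[-3, 3]$, and the grid step are illustrative assumptions, and the discretization error means the output should not be used as a source of exact critical values.

```python
import numpy as np

def simulate_chernoff(n_draws=2000, half_width=3.0, step=1e-3, seed=0):
    """Approximate draws of argmax_h {Z(h) - h^2} on a truncated grid (a sketch)."""
    rng = np.random.default_rng(seed)
    k = int(round(half_width / step))            # grid points on each side of 0
    h_pos = step * np.arange(1, k + 1)           # positive part of the grid
    grid = np.concatenate([-h_pos[::-1], [0.0], h_pos])
    draws = np.empty(n_draws)
    for i in range(n_draws):
        # Two independent Brownian paths started at 0 form the two-sided motion.
        right = np.cumsum(rng.normal(0.0, np.sqrt(step), k))
        left = np.cumsum(rng.normal(0.0, np.sqrt(step), k))
        z = np.concatenate([left[::-1], [0.0], right])   # Z(0) = 0 in the middle
        draws[i] = grid[np.argmax(z - grid**2)]
    return draws

if __name__ == "__main__":
    c = simulate_chernoff()
    # Empirical upper quantile of the simulated draws (approximate only).
    print(np.quantile(c, 0.975))
```

The truncation is harmless in practice because the drift term $-h^2$ makes a maximizer far from zero extremely unlikely, but a smaller step and more draws are needed before the quantiles stabilize.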

Because of this similarity between the Grenander estimator and the current 
status survival estimator, it appears that at least some of what we could learn about 
the Grenander estimator may be applicable on many levels to inference problems 



*Supported in part by Grant CA075142 from the National Cancer Institute.

1 Department of Biostatistics, School of Public Health, University of North Carolina-Chapel
Hill, 3101 McGavran-Greenberg Hall, CB 7420, Chapel Hill, NC 27599-7420, USA, e-mail:
kosorok@unc.edu

AMS 2000 subject classifications: Primary 62G09, 62G07; secondary 60F05, 60G15.
Keywords and phrases: Chernoff's distribution, confidence bands, kernel estimators, $L_1$ error,
Monte Carlo methods, pointwise error, uniform error.






in current status and possibly other, more complex, data types. We will not pursue 
this connection further in this paper, but the interested reader should compare and 
contrast the derivations of both of these estimators as given side-by-side in Sections 
3.2.14 and 3.2.15 of van der Vaart and Wellner [18]. 

There are a number of challenges with using $C$ (Chernoff's distribution) directly
for inference. The first challenge is that the density for $C$, the form of which was
derived by Groeneboom [9], does not have a closed form. Computing critical values
for the distribution is quite difficult (Dykstra and Carolan [5], Narayanan and Sager
[14]) but has been done (Groeneboom and Wellner [11]). A second challenge is that
the normalization constants involved can be difficult to estimate. For these reasons,
computationally reasonable approaches that avoid computing critical values from
Chernoff's distribution, and/or ameliorate the need for computing complicated
normalization constants, would be very appealing. This pursuit is the theme of this
paper.

An obvious approach to consider because of computational simplicity is the
nonparametric bootstrap. Unfortunately, the first main result of this paper is that the
nonparametric bootstrap is inconsistent for pointwise inference (i.e., inference for
$f(t)$ at a given value of $t \in [0, 1]$). We prove this rigorously in Theorem 2.1 below.
The key argument is contained in Theorem 2.2 below and is applicable to many
other inference settings. The inconsistency of the bootstrap was also observed by
Abrevaya and Huang [1] for the maximum score estimator, which also has a
Chernoff limit, although they did not provide a rigorous proof of this. Fortunately, we
are able to show that a smoothed bootstrap (Silverman and Young [17]) obtained
by sampling from a certain kernel estimator is consistent. These pointwise inference
results will be presented in Section 2.

It would be nice if some of the pointwise results could be utilized in the
development of uniform confidence bands, but this appears to be an excruciatingly difficult
problem. However, some progress has been made for $L_1$ confidence bands.
Building on the work of Groeneboom et al. [12], who derive the limiting distribution of
the $L_1$ error of the Grenander estimator, we propose a "supersampling" smoothed
bootstrap. This is discussed in Section 3. One of the discoveries made in this process
is that the assumptions needed for $L_1$ convergence of the Grenander estimator are
so strong that there exist kernel estimators with faster convergence rates than the
Grenander estimator.

We conclude the paper with a discussion of the implications of these results and
several open questions in Section 4. The main contributions of the paper are, first, a
proof of the invalidity of the nonparametric bootstrap for the Grenander estimator,
and, second, the development of smoothed bootstrap procedures for both pointwise
and $L_1$ confidence bands. The results and ideas of this paper should prove useful in
developing solutions to the confidence band problem for the Grenander estimator as
well as for current status survival function estimators and other related monotone
function estimators.

2. Pointwise error 

The focus of this section is on pointwise inference based on the Grenander estimator.
Before presenting the main results on the bootstrap, we first briefly review known
asymptotic distribution results for the Grenander estimator $\hat f_n$. Before doing this,
however, we make the following assumptions about the density $f$:

A1. $0 < f(1) < f(s) < f(t) < f(0) < \infty$, for all $0 < t < s < 1$; and
A2. $f$ is differentiable with derivative $\dot f$ satisfying

$$0 < \inf_{t \in (0,1)} |\dot f(t)| \le \sup_{t \in (0,1)} |\dot f(t)| < \infty.$$

We may occasionally need stronger assumptions which will be introduced as needed. 

It is well known that $\hat f_n$ is the left derivative of the least concave majorant of
the empirical distribution function $\mathbb{F}_n$ (see, for example, Section 3.2.14 of van der
Vaart and Wellner [18]). Moreover, under assumptions A1 and A2, we have the now
classic result that for any $t \in (0, 1)$,

$$n^{1/3}\bigl(\hat f_n(t) - f(t)\bigr) \rightsquigarrow |4 f(t) \dot f(t)|^{1/3} \operatorname*{argmax}_{h \in \mathbb{R}} \{Z(h) - h^2\} = c(t) C$$

(Groeneboom [8]). Both the normalizing constant $c(t)$ and critical values for
Chernoff's distribution are needed for inference.
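
The characterization of the Grenander estimator as the left derivative of the least concave majorant translates directly into a short computation. The following is a minimal sketch, not the paper's code: the function names `grenander` and `evaluate` are illustrative, and the sketch assumes distinct, strictly positive observations. It builds the upper hull of the points $(0,0), (X_{(i)}, i/n)$ with a stack and returns the hull slopes.

```python
import numpy as np

def grenander(x):
    """Return (knots, slopes); the estimate equals slopes[j] on (knots[j], knots[j+1]]."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    px = np.concatenate([[0.0], x])      # abscissae of the ECDF points, starting at 0
    py = np.arange(n + 1) / n            # ECDF heights 0, 1/n, ..., 1
    hull = [0]
    for i in range(1, n + 1):
        # Drop the last vertex while the slope into it does not exceed the
        # slope out of it, so that the remaining slopes are decreasing.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            if (py[b] - py[a]) * (px[i] - px[b]) <= (py[i] - py[b]) * (px[b] - px[a]):
                hull.pop()
            else:
                break
        hull.append(i)
    knots = px[hull]
    slopes = np.diff(py[hull]) / np.diff(knots)
    return knots, slopes

def evaluate(knots, slopes, t):
    """Left-continuous evaluation; zero to the right of the largest observation."""
    t = np.asarray(t, dtype=float)
    idx = np.clip(np.searchsorted(knots, t, side="left") - 1, 0, slopes.size - 1)
    return np.where(t > knots[-1], 0.0, slopes[idx])
```

For example, `evaluate(*grenander(sample), 0.3)` gives the estimate at $t = 0.3$. The stack-based hull runs in linear time after sorting, which matters when the estimator is recomputed inside a bootstrap loop.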

Before proceeding to the bootstrap discussion, we point out that an alternative
to the above approach to inference about $f$ is to use the nonparametric maximum
likelihood ratio as done by Banerjee and Wellner [2], which has an asymptotically
pivotal distribution that avoids the need to estimate a normalizing constant. The
limiting distribution in this setting is not Chernoff's distribution but is still quite
complicated and does not have a known, closed form. Computing critical values is
possible but complicated (Banerjee and Wellner [3]).

Another alternative that almost always works theoretically is the subsampling
bootstrap of Politis and Romano [15]. The basic idea is to perform a bootstrap
without replacement of sample size $m$ which is much smaller than the actual sample
size $n$. Provided $m \to \infty$ and $m/n \to 0$, the standardized subsample bootstrap will
be valid. Unfortunately, in practice, this will not work unless $n$ is quite large since
the asymptotic approximation must be approximately valid for the subsample size
$m$, not just valid for $n$. We will not pursue the subsample bootstrap further in this
paper.

Let $\hat{\mathbb{F}}_n^*$ be the usual nonparametric bootstrap empirical distribution function,
and let $\hat f_n^*$ be the left derivative of the least concave majorant of $\hat{\mathbb{F}}_n^*$. What we
would like to show is that $n^{1/3}(\hat f_n^*(t) - \hat f_n(t))$, conditional on the data $X_1, X_2, \ldots$,
converges to the unconditional limiting distribution of $n^{1/3}(\hat f_n(t) - f(t))$. Our first
main result is that this approach is unfruitful, as we now show in the following
theorem:

Theorem 2.1. The nonparametric bootstrap is inconsistent for the Grenander
estimator, i.e., $n^{1/3}(\hat f_n^*(t) - \hat f_n(t))$ does not converge in probability, conditional on the
data, to $c(t)C$, for any $t \in (0, 1)$.

Before giving the proof of this theorem, we present a general theorem which
can be useful in studying bootstrap validity. Let $X_n$ be a random variable in a
Banach space $(\mathbb{B}, \|\cdot\|)$ that converges weakly to a tight limit $X$, and let $\hat X_n$ denote
a bootstrapped version of $X_n$ based on some random weighting mechanism $W_n$
which is independent of the data $\mathcal{X}_n$ used to generate $X_n$. We say that $\hat X_n$ is a
valid bootstrap if its limiting distribution conditional on $\mathcal{X}_n$ "converges weakly" to
$X$.

We now define what "converges weakly" means in this context. Let $BL_1(\mathbb{B})$ be
the collection of all Lipschitz continuous functions $h : \mathbb{B} \mapsto \mathbb{R}$ bounded in absolute
value by 1 and having Lipschitz constant 1, i.e., $|h| \le 1$ and $|h(x) - h(y)| \le \|x - y\|$
for all $x, y \in \mathbb{B}$. We say $\hat X_n$ converges weakly conditional on the data to $X$ if

$$\sup_{h \in BL_1(\mathbb{B})} \bigl| E_{\cdot|\mathcal{X}_n} h(\hat X_n) - E h(X) \bigr| \to 0$$

in outer probability, where $E_{\cdot|\mathcal{X}_n}$ denotes conditional expectation given $\mathcal{X}_n$, and
provided $h(\hat X_n)$ is asymptotically measurable unconditionally for all $h \in BL_1(\mathbb{B})$.
We denote this kind of conditional convergence $\hat X_n \overset{P}{\underset{W}{\rightsquigarrow}} X$. We also require $h(\hat X_n)$ to
be a measurable function of $W_n$ conditional on $\mathcal{X}_n$ for all $h \in BL_1(\mathbb{B})$. A more
precise discussion of this general formulation of the bootstrap can be found in van
der Vaart and Wellner [18]. The following is a general result for these kinds of
bootstraps:

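Theorem 2.2. Assume $X_n \rightsquigarrow X$, where $X$ is tight, and that $\hat X_n \overset{P}{\underset{W}{\rightsquigarrow}} X$, where
$W_n \mapsto h(\hat X_n)$ is measurable conditional on $\mathcal{X}_n$ for all $h \in BL_1(\mathbb{B})$. Then
$(\hat X_n, X_n) \rightsquigarrow (X_1, X_2)$ unconditionally, where $X_1$ and $X_2$ are independent copies of $X$.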

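Proof. Let $X_1$ and $X_2$ be two independent copies of $X$, which are also independent
of the data $\mathcal{X}_n$, and note that

$$\begin{aligned}
\sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E^* h(\hat X_n, X_n) - E h(X_1, X_2) \bigr|
&\le \sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E^* h(\hat X_n, X_n) - E^* h(X_1, X_n) \bigr| \\
&\quad + \sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E^* h(X_1, X_n) - E h(X_1, X_2) \bigr| \\
&= A_n + B_n.
\end{aligned}$$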

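Note also that for any $h \in BL_1(\mathbb{B}^2)$ and any $y \in \mathbb{B}$, both $x \mapsto h(x, y)$ and
$x \mapsto h(y, x)$ are members of $BL_1(\mathbb{B})$. As a consequence of the conditional weak
convergence of $\hat X_n$, we therefore have

$$\sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E_{\cdot|\mathcal{X}_n} h(\hat X_n, X_n) - E_{\cdot|\mathcal{X}_n} h(X_1, X_n) \bigr|
\le \sup_{h \in BL_1(\mathbb{B})} \bigl| E_{\cdot|\mathcal{X}_n} h(\hat X_n) - E h(X_1) \bigr| \overset{P}{\to} 0,$$

where $\overset{P}{\to}$ denotes convergence in outer probability. Provided both

$$\sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E^* E_{\cdot|\mathcal{X}_n} h(\hat X_n, X_n) - E^* h(\hat X_n, X_n) \bigr| \to 0 \tag{2.1}$$

and

$$\sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E^* E_{\cdot|\mathcal{X}_n} h(X_1, X_n) - E^* h(X_1, X_n) \bigr| \to 0, \tag{2.2}$$

we will obtain that $A_n \to 0$.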

Arguing in a similar manner but utilizing instead the assumed weak convergence
of $X_n$, we have

$$\sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E_{\cdot|X_1} h(X_1, X_n) - E_{\cdot|X_1} h(X_1, X_2) \bigr|
\le \sup_{h \in BL_1(\mathbb{B})} \bigl| E^* h(X_n) - E h(X_2) \bigr| \to 0,$$

where $E_{\cdot|X_1}$ denotes conditional expectation given $X_1$. Provided

$$\sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E^* E_{\cdot|X_1} h(X_1, X_n) - E^* h(X_1, X_n) \bigr| \to 0, \tag{2.3}$$

we will obtain that $B_n \to 0$, and the desired conclusion of the theorem will follow.

The proof is essentially complete, except for establishing (2.1), (2.2), and (2.3),
which are primarily measurability technicalities. The uninterested reader can skip
this part of the proof and proceed directly to the proof of Theorem 2.1 below. Since
$h(\hat X_n)$ is asymptotically measurable for all $h \in BL_1(\mathbb{B})$, it is asymptotically measurable
for all $h$ that are bounded and Lipschitz continuous for any Lipschitz constant. Thus
the conditional weak convergence of $\hat X_n$ implies that for every $\epsilon > 0$, there exists a
compact $K \subset \mathbb{B}$ such that

$$E^* P\bigl(\hat X_n \in K^\delta \mid \mathcal{X}_n\bigr) \to P(X \in K^\delta) > 1 - \epsilon,$$

for every $\delta > 0$, where $K^\delta = \{x \in \mathbb{B} : \|x - y\| < \delta \text{ for some } y \in K\}$. Hence, by
Fubini's theorem for outer expectations (Lemma 1.2.6 of van der Vaart and
Wellner [18]), $\hat X_n$ is asymptotically tight unconditionally. Thus it is also asymptotically
measurable unconditionally by reapplication of Lemma 1.3.13 of van der Vaart and
Wellner [18]. Since marginal asymptotic tightness plus marginal asymptotic
measurability implies joint asymptotic tightness and measurability (see Lemmas 1.4.3
and 1.4.4 of van der Vaart and Wellner [18]), we have that $(\hat X_n, X_n)$ is jointly
asymptotically tight and measurable. Thus

$$\sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E^* h(\hat X_n, X_n) - E_* h(\hat X_n, X_n) \bigr| \to 0,$$

and condition (2.1) follows.

The assumed weak convergence of $X_n$ implies asymptotic measurability via
Lemma 1.3.13 of van der Vaart and Wellner [18], and thus (2.2) also follows.
Since $X_n$ converges weakly, $(X_1, X_n)$ jointly converges weakly, and thus $(X_1, X_n)$
is asymptotically measurable. Hence

$$\sup_{h \in BL_1(\mathbb{B}^2)} \bigl| E^* h(X_1, X_n) - E_* h(X_1, X_n) \bigr| \to 0,$$

and (2.3) will follow. This completes the proof in all of its formality. □
Proof of Theorem 2.1. The basic idea of the proof is to assume that

$$n^{1/3}\bigl(\hat f_n^*(t) - \hat f_n(t)\bigr) \overset{P}{\underset{W}{\rightsquigarrow}} c(t) C, \tag{2.4}$$

where the $W$ refers to the random multinomial weights $W_n = \{W_{1,n}, \ldots, W_{n,n}\}$ in
the nonparametric bootstrap, and then use Theorem 2.2 to obtain a contradiction.
Accordingly, assume (2.4), and let $\hat X_n = n^{1/3}(\hat f_n^*(t) - \hat f_n(t))$ and
$X_n = n^{1/3}(\hat f_n(t) - f(t))$. Then Theorem 2.2 implies that $\hat X_n + X_n \rightsquigarrow c(t)(C_1 + C_2)$, unconditionally,
where $C_1$ and $C_2$ are two independent copies of $C$.

Since $Y_n = \hat X_n + X_n = n^{1/3}(\hat f_n^*(t) - f(t))$, the above results imply that $Y_n$ converges
unconditionally to a tight limiting distribution which has twice the variance of $c(t)C$.
Using arguments along the lines of those used in Section 3.2.14 of van der Vaart
and Wellner [18], along with properties of bootstrapped empirical processes, it is
not hard to verify, however, that

$$n^{1/3}\bigl(\hat f_n^*(t) - f(t)\bigr) \rightsquigarrow c(t) \operatorname*{argmax}_{h} \{Z_1(h) + Z_2(h) - h^2\}, \tag{2.5}$$

where $Z_1$ and $Z_2$ are independent two-sided Brownian motions.

Using symmetry properties of Brownian motion and a careful change of variables,
we can derive that $\tilde C = \operatorname*{argmax}_h \{Z_1(h) + Z_2(h) - h^2\}$ has the same distribution as
$a \operatorname*{argmax}_h \{\sqrt{2a}\, Z(h) - (ah)^2\}$, for a two-sided Brownian motion $Z$ and any $a > 0$.
Choosing $a = 2^{1/3}$ yields that $\tilde C$ has the same
distribution as $2^{1/3} C$, and thus the variance of the limiting distribution of $Y_n$ is
$2^{2/3} < 2$ times the variance of $c(t)C$. This is a contradiction, and thus the desired
conclusion follows. □
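
The change of variables at the end of the proof can be written out step by step. The display below is a reconstruction under the assumption that the only facts used are $Z_1 + Z_2 \overset{d}{=} \sqrt{2}\, Z$ and Brownian scaling, $Z(a\,\cdot) \overset{d}{=} \sqrt{a}\, Z(\cdot)$ for $a > 0$; the symbol $\tilde C$ for the argmax appearing in (2.5) is introduced here only for convenience.

```latex
\begin{align*}
\tilde C
  &= \operatorname*{argmax}_{h} \{ Z_1(h) + Z_2(h) - h^2 \}
   \overset{d}{=} \operatorname*{argmax}_{h} \{ \sqrt{2}\, Z(h) - h^2 \} \\
  &= a \operatorname*{argmax}_{u} \{ \sqrt{2}\, Z(au) - (au)^2 \}
   && \text{(substituting } h = au\text{)} \\
  &\overset{d}{=} a \operatorname*{argmax}_{u} \{ \sqrt{2a}\, Z(u) - a^2 u^2 \}
   && \text{(Brownian scaling)} \\
  &= a \operatorname*{argmax}_{u} \{ a^2 ( Z(u) - u^2 ) \} = 2^{1/3}\, C
   && \text{(taking } a = 2^{1/3}\text{, so that } \sqrt{2a} = a^2\text{)}.
\end{align*}
```

Consequently the variance of $c(t)\tilde C$ is $2^{2/3}$ times that of $c(t)C$, whereas the sum of two independent copies would have variance exactly twice that of $c(t)C$.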

We now work toward developing an asymptotically valid alternative to the
nonparametric bootstrap. To accomplish this, we propose a version of the "smoothed"
bootstrap (Silverman and Young [17]). The idea is that we estimate the density with
a certain modified kernel density estimator $\tilde f_n$, and then draw a smoothed
bootstrap sample from $\tilde f_n$. Our goal is to ensure that the properties of this procedure
lead to valid inference.

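Let the kernel be $K$ and assume the bandwidth $1/2 > h \to 0$ as $n \to \infty$. For all
$t \in [h, 1 - h]$, let

$$\check f_n(t) = \int_0^1 \frac{1}{h} K\left(\frac{t - u}{h}\right) d\mathbb{F}_n(u),$$

and denote $\check f_n^{(1)}$ as the first derivative of $\check f_n$ (so far only defined on $[h, 1 - h]$). For
$t \in [0, h)$, let

$$\check f_n(t) = \check f_n(h) + (t - h)\bigl\{\check f_n^{(1)}(h) \wedge 0\bigr\},$$

and for $t \in (1 - h, 1]$, let

$$\check f_n(t) = \check f_n(1 - h) + (t - 1 + h)\bigl\{\check f_n^{(1)}(1 - h) \wedge 0\bigr\}.$$

Finally, define

$$\tilde f_n(t) = \frac{\check f_n(t) \vee 0}{\int_0^1 \{\check f_n(s) \vee 0\}\, ds}.$$
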

We need the following assumptions on $K$:

B1. The kernel $K$ is nonnegative with support on $[-1, 1]$;
B2. $K$ is bounded and $\int_{-1}^{1} K(v)\, dv = 1$;
B3. $\dot K$ is bounded, < for all $v \in [-1, 1]$, $\int_{-1}^{1} \dot K(v)\, dv = 0$, and
$\int_{-1}^{1} v \dot K(v)\, dv = -1$; and
B4. $|\ddot K|$ is uniformly bounded over $(-1, 1)$.

Two examples of kernels that satisfy B1-B4 are $K(v) = (3/4)(1 - v^2)$ and
$K(v) = (15/16)(1 - v^2)^2$.
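
The integral conditions on the two example kernels are easy to confirm, since both kernels are polynomials on $[-1, 1]$. The snippet below is an illustrative check, not part of the paper; the helper names `integral` and `moment_checks` are made up for this sketch. It evaluates the integrals of $K$, $\dot K$ and $v\dot K$ exactly, and also reports the corresponding integrals involving the second derivative $\ddot K$, where the two kernels differ.

```python
import numpy as np
from numpy.polynomial import Polynomial

# The two example kernels, as polynomials in v on [-1, 1].
epanechnikov = Polynomial([0.75, 0.0, -0.75])                  # (3/4)(1 - v^2)
biweight = (15.0 / 16.0) * Polynomial([1.0, 0.0, -1.0]) ** 2    # (15/16)(1 - v^2)^2
v = Polynomial([0.0, 1.0])                                      # the identity v -> v

def integral(p, lo=-1.0, hi=1.0):
    """Exact integral of a polynomial over [lo, hi] via its antiderivative."""
    antiderivative = p.integ()
    return antiderivative(hi) - antiderivative(lo)

def moment_checks(kernel):
    """Integrals of K, K', v K', K'' and v K'' over [-1, 1]."""
    d1, d2 = kernel.deriv(1), kernel.deriv(2)
    return {
        "int K": integral(kernel),        # should equal 1
        "int K'": integral(d1),           # should equal 0
        "int v K'": integral(v * d1),     # should equal -1
        "int K''": integral(d2),
        "int v K''": integral(v * d2),
    }

if __name__ == "__main__":
    for name, kernel in [("Epanechnikov", epanechnikov), ("biweight", biweight)]:
        print(name, moment_checks(kernel))
```

Both kernels give the values 1, 0 and -1 for the first three integrals. The integral of $\ddot K$ vanishes for the second kernel but not for the first, because $\dot K(\pm 1) \neq 0$ for $(3/4)(1 - v^2)$.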

We now have the following lemma:
Lemma 2.1. Provided $h = R_n n^{-\alpha}$, where $0 < R_n + R_n^{-1} = O_P(1)$ and $\alpha \in (0, 1/3)$,
we have the following under assumptions A1-A2 and B1-B4:
(i) $\tilde f_n$ is uniformly consistent for $f$;

(ii) There exist constants $0 < a < b < \infty$ such that

$$-b - o_P(1) \le \inf_{t \in (0,1)} \tilde f_n^{(1)}(t) \le \sup_{t \in (0,1)} \tilde f_n^{(1)}(t) \le -a + o_P(1).$$

Under the additional assumption

A3. $\dot f(t)$ is continuous at $t = t_0$, for some $t_0 \in (0, 1)$,
we also have

(iii) $\tilde f_n^{(1)}(t_0) = \dot f(t_0) + o_P(1)$.

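Proof. The proof of conclusion (i) follows from standard arguments, and we omit
the details. For (ii), we use change of variables to obtain that

$$\begin{aligned}
\int h^{-2} \dot K\left(\frac{t - u}{h}\right) d\mathbb{F}_n(u)
&= n^{-1/2} \int h^{-2} \dot K\left(\frac{t - u}{h}\right) d\mathbb{H}_n(u)
 + h^{-1} \int \dot K(v) \bigl(f(t - hv) - f(t)\bigr)\, dv \\
&= O_P\Bigl(n^{-1/2} h^{-2} \sup_{|s - t| \le h} |\mathbb{H}_n(s)|\Bigr)
 - h^{-1} \int \dot K(v) \dot f(t^*_{hv})\, hv\, dv,
\end{aligned}$$

where $u \mapsto \mathbb{H}_n(u) = \mathbb{G}_n(u) - \mathbb{G}_n(t)$, $\mathbb{G}_n = \sqrt{n}(\mathbb{F}_n - F)$ and $t^*_{hv}$ is on the line
segment between $t$ and $t - hv$. The second term on the first line uses $\int \dot K(v)\, dv = 0$, and
by $\int v \dot K(v)\, dv = -1$ the last term is a weighted average of values of $\dot f$.
By A1 and A2 combined with the fact that

$$n^{-1/2} h^{-3/2} \sqrt{\log(1/h)} = o(1),$$

conclusion (ii) follows. Conclusion (iii) follows because, when $\dot f$ is continuous at $t_0$,
$\dot f(t^*_{hv}) \to \dot f(t)$ at $t = t_0$. □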

Now let $\tilde f_n^*$ be the left derivative of the least concave majorant of the empirical distribution
function of a sample of size $n$ drawn from $\tilde f_n$. Computationally, this is easy to do
using rejection sampling applied to the unnormalized version of $\tilde f_n$, so that the
normalizing constant is not needed.
We have the following result:

Proposition 2.1. Under conditions A1-A3 and B1-B4,

$$n^{1/3}\bigl(\tilde f_n^*(t_0) - \tilde f_n(t_0)\bigr) \overset{P}{\underset{*}{\rightsquigarrow}} c(t_0) C,$$

where $*$ denotes the random component of the smoothed bootstrap. In other words,
the proposed smoothed bootstrap is consistent in probability.

Sketch of proof. The proof follows the same general arguments used in the proof
of the weak convergence of $n^{1/3}(\hat f_n(t) - f(t))$. The main idea is that because $\tilde f_n$
satisfies the conclusions of Lemma 2.1, it satisfies assumptions A1 and A2, for
all $n$ large enough with probability approaching 1, and both $\tilde f_n(t_0) \overset{P}{\to} f(t_0)$ and
$\tilde f_n^{(1)}(t_0) \overset{P}{\to} \dot f(t_0)$. The key challenge is to obtain empirical process results for
$\tilde{\mathbb{G}}_n = \sqrt{n}(\tilde{\mathbb{P}}_n - \tilde P_n)$, where $\tilde{\mathbb{P}}_n$ is the empirical distribution for an i.i.d. sample from $\tilde P_n$,
where $\tilde P_n$ changes with $n$. We need modulus of continuity results which are uniform
in $\tilde P_n$. Some related uniformity concepts and results are given in Chapter 2.8 of van
der Vaart and Wellner [18]. In our case, $\tilde P_n$ is the probability measure obtained by
integrating $\tilde f_n$. We omit the remaining details (which are lengthy). □
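
The sampling step of the smoothed bootstrap can be sketched as follows. This is an illustrative implementation under several assumptions that are not in the paper: the kernel is taken to be $(15/16)(1 - v^2)^2$, the function names `kernel_estimate` and `smoothed_bootstrap_sample` are hypothetical, the proposal distribution is uniform on $[0, 1]$, and the rejection envelope is a heuristic (the maximum of the estimate over a grid, inflated by 5%) rather than a proven bound. The estimate is left unnormalized, which is all that rejection sampling requires.

```python
import numpy as np

def biweight(v):
    return np.where(np.abs(v) <= 1, (15 / 16) * (1 - v**2) ** 2, 0.0)

def biweight_deriv(v):
    return np.where(np.abs(v) <= 1, -(15 / 4) * v * (1 - v**2), 0.0)

def kernel_estimate(t, data, h):
    """Unnormalized modified kernel estimate on [0, 1], truncated below at 0."""
    t = np.atleast_1d(np.asarray(t, dtype=float))

    def interior(s):            # ordinary kernel estimate, used on [h, 1 - h]
        return biweight((s[:, None] - data[None, :]) / h).mean(axis=1) / h

    def interior_deriv(s):      # its first derivative
        return biweight_deriv((s[:, None] - data[None, :]) / h).mean(axis=1) / h**2

    lo, hi = np.array([h]), np.array([1 - h])
    slope_lo = min(interior_deriv(lo)[0], 0.0)   # derivative at h, capped at 0
    slope_hi = min(interior_deriv(hi)[0], 0.0)   # derivative at 1 - h, capped at 0
    out = interior(np.clip(t, h, 1 - h))
    # Linear extension to the two boundary strips [0, h) and (1 - h, 1].
    out = np.where(t < h, interior(lo)[0] + (t - h) * slope_lo, out)
    out = np.where(t > 1 - h, interior(hi)[0] + (t - 1 + h) * slope_hi, out)
    return np.maximum(out, 0.0)

def smoothed_bootstrap_sample(data, h, size, rng, grid_points=513):
    """Draw `size` points by rejection sampling with a uniform proposal."""
    grid = np.linspace(0.0, 1.0, grid_points)
    envelope = 1.05 * kernel_estimate(grid, data, h).max()   # heuristic envelope
    draws = np.empty(0)
    while draws.size < size:
        cand = rng.uniform(0.0, 1.0, size)
        accept = rng.uniform(0.0, envelope, size) <= kernel_estimate(cand, data, h)
        draws = np.concatenate([draws, cand[accept]])
    return draws[:size]
```

Applying a least-concave-majorant routine to the output of `smoothed_bootstrap_sample` then gives one realization of the bootstrapped Grenander estimator; repeating this yields the bootstrap distribution at the point of interest.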



3. $L_1$ error



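Under A1-A2 and the additional assumption
A3'. $\sup_{t \in (0,1)} |\ddot f(t)| < \infty$,
Groeneboom et al. [12] proved that

$$n^{1/6} \Bigl\{ n^{1/3} \int_0^1 |\hat f_n(t) - f(t)|\, dt - \mu(f) \Bigr\} \rightsquigarrow N(0, \sigma^2),$$

where

$$\sigma^2 = 8 \int_0^\infty \operatorname{cov}\bigl(|\xi(0)|, |\xi(x)|\bigr)\, dx$$

and, for a differentiable density $g$,

$$\mu(g) = 2 E|\xi(0)| \int_0^1 \bigl| \tfrac{1}{2} \dot g(t) g(t) \bigr|^{1/3}\, dt.$$

In the above, the process $\xi$ is a stationary process constructed from a two-sided
Brownian motion $Z$ as follows:

$$\xi(t) = \operatorname*{argmax}_{h \in \mathbb{R}} \{Z(t + h) - Z(t) - h^2\}.$$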

We will utilize the smoothed bootstrap $\tilde f_n^*$ again, for inference in this setting,
but we will need a "smoother" kernel and larger bandwidth. In particular, we need
the additional assumptions

B5. $\int_{-1}^{1} v K(v)\, dv = 0$, $\int_{-1}^{1} \ddot K(v)\, dv = 0$, and $\int_{-1}^{1} v \ddot K(v)\, dv = 0$; and
B6. $|(d/dt) \ddot K(t)|$ is uniformly bounded over $(-1, 1)$.

A kernel that satisfies B1-B6 is $K(v) = (15/16)(1 - v^2)^2$.

We have the following lemma. The proof is similar to the proof of Lemma 2.1,
and we omit the details.

Lemma 3.1. Assume $h = R_n n^{-\alpha}$, where $0 < R_n + R_n^{-1} = O_P(1)$ and
$\alpha \in (1/6, 1/5)$. Under A1-A2, A3', and B1-B6, we have the following:

(i) $\sup_{t \in [0,1]} |\tilde f_n(t) - f(t)| = O_P(n^{-2\alpha})$;

(ii) $\sup_{t \in (0,1)} |\tilde f_n^{(1)}(t) - \dot f(t)| = O_P(n^{-\alpha})$; and

(iii) There exists a constant $a < \infty$ such that

$$\sup_{t \in (0,1)} |\tilde f_n^{(2)}(t)| \le a + o_P(1).$$


In particular, we have that

$$\int_0^1 \bigl| \tfrac{1}{2} \tilde f_n^{(1)}(t) \tilde f_n(t) \bigr|^{1/3}\, dt
 - \int_0^1 \bigl| \tfrac{1}{2} \dot f(t) f(t) \bigr|^{1/3}\, dt = o_P(n^{-1/6}).$$

Here is the proposed procedure. Let

• $\tilde f_n^*$ be a bootstrapped Grenander estimator based on an i.i.d. sample $X_1^*, \ldots, X_n^*$
drawn from $\tilde f_n$, and

• $\tilde f_{n,m}^{**}$ be an additional bootstrapped Grenander estimator based on an i.i.d. sample
$X_1^{**}, \ldots, X_m^{**}$ also drawn from $\tilde f_n$ but independent of the first sample, where $m$
is much larger than $n$, i.e., we require $m/n \to \infty$.

Compute

$$\hat\mu_{n,m} = m^{1/3} \int_0^1 |\tilde f_{n,m}^{**}(t) - \tilde f_n(t)|\, dt$$

based on one large bootstrap realization. Finally, estimate the $1 - \alpha$ upper critical
value, which estimate we denote $\hat C_\alpha$, of the bootstrapped distribution of

$$n^{1/6} \Bigl\{ n^{1/3} \int_0^1 |\tilde f_n^*(t) - \tilde f_n(t)|\, dt - \hat\mu_{n,m} \Bigr\},$$

based on repeating the bootstrap $\tilde f_n^*$. Note that the large bootstrap (a "supersample
bootstrap") is only computed once in the process of obtaining this critical value.
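
The procedure above can be sketched in code as follows. This is a hedged illustration rather than the paper's implementation: the function names are hypothetical, the $L_1$ integral is approximated by a midpoint rule on a uniform grid, and the callables `density` and `sampler` stand in for the smoothed kernel estimate and for a routine that draws from it. In the usage lines, a known decreasing density $1.5 - t$ on $[0, 1]$ with an inverse-CDF sampler is substituted purely so that the example is self-contained.

```python
import numpy as np

def grenander_on_grid(sample, grid):
    """Grenander estimate (slopes of the least concave majorant) at grid points."""
    x = np.concatenate([[0.0], np.sort(sample)])
    y = np.arange(x.size) / (x.size - 1)
    hull = [0]
    for i in range(1, x.size):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            if (y[b] - y[a]) * (x[i] - x[b]) <= (y[i] - y[b]) * (x[b] - x[a]):
                hull.pop()          # vertex b lies below the concave majorant
            else:
                break
        hull.append(i)
    knots, slopes = x[hull], np.diff(y[hull]) / np.diff(x[hull])
    idx = np.clip(np.searchsorted(knots, grid, side="left") - 1, 0, slopes.size - 1)
    return np.where(grid > knots[-1], 0.0, slopes[idx])

def l1_distance(sample, density, grid):
    """Midpoint-rule approximation of the integral of |Grenander - density| on [0, 1]."""
    return np.mean(np.abs(grenander_on_grid(sample, grid) - density(grid)))

def supersample_critical_value(density, sampler, n, m, alpha, n_boot, rng, grid_size=2000):
    grid = (np.arange(grid_size) + 0.5) / grid_size
    # One large bootstrap realization of size m, used only for centering.
    mu_hat = m ** (1 / 3) * l1_distance(sampler(m, rng), density, grid)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        d = l1_distance(sampler(n, rng), density, grid)
        stats[b] = n ** (1 / 6) * (n ** (1 / 3) * d - mu_hat)
    return mu_hat, np.quantile(stats, 1 - alpha)

# Stand-ins for the smoothed estimate and its sampler (assumptions for the demo).
def linear_density(t):
    return 1.5 - t

def linear_sampler(size, rng):
    return (3 - np.sqrt(9 - 8 * rng.uniform(size=size))) / 2   # inverse CDF of 1.5 - t

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(supersample_critical_value(linear_density, linear_sampler,
                                     n=200, m=20000, alpha=0.05, n_boot=200, rng=rng))
```

The design point worth noting is that the expensive size-$m$ draw happens once, outside the loop, while only size-$n$ samples are regenerated for the quantile estimate.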

We have the following proposition:
Proposition 3.1. Assume the conditions of Lemma 3.1 and that $\hat C_\alpha$ is obtained
through using the above procedure. Then the set of densities

$$\Bigl\{ g : \int_0^1 |\hat f_n(t) - g(t)|\, dt \le n^{-1/3} \hat\mu_{n,m} + n^{-1/2} \hat C_\alpha \Bigr\}$$

has asymptotically $(1 - \alpha)$ coverage, provided $m/n \to \infty$.



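Sketch of proof. The proof is very lengthy and we omit the details. As with the proof
of Proposition 2.1, the basic argument is that $\tilde f_n$ shares the required properties
of $f$ for all sufficiently large $n$ with probability approaching 1, as a consequence
of Lemma 3.1. Under these circumstances, the arguments in the proof given in
Groeneboom et al. [12] can be carried over from $f$ to $\tilde f_n$. Accordingly, we first
obtain that

$$n^{1/6} \Bigl\{ n^{1/3} \int_0^1 |\tilde f_n^*(t) - \tilde f_n(t)|\, dt - \mu(\tilde f_n) \Bigr\} \overset{P}{\underset{*}{\rightsquigarrow}} N(0, \sigma^2)$$

and

$$m^{1/6} \Bigl\{ m^{1/3} \int_0^1 |\tilde f_{n,m}^{**}(t) - \tilde f_n(t)|\, dt - \mu(\tilde f_n) \Bigr\} \overset{P}{\underset{*}{\rightsquigarrow}} N(0, \sigma^2).$$

Thus $\hat\mu_{n,m} - \mu(\tilde f_n) = O_P(m^{-1/6}) = o_P(n^{-1/6})$, conditionally on the data, and
hence both

$$n^{1/6} \Bigl\{ n^{1/3} \int_0^1 |\tilde f_n^*(t) - \tilde f_n(t)|\, dt - \hat\mu_{n,m} \Bigr\} \overset{P}{\underset{*}{\rightsquigarrow}} N(0, \sigma^2) \tag{3.1}$$

and

$$\begin{aligned}
n^{1/6} \Bigl\{ n^{1/3} \int_0^1 |\hat f_n(t) - f(t)|\, dt - \hat\mu_{n,m} \Bigr\}
&= n^{1/6} \Bigl\{ n^{1/3} \int_0^1 |\hat f_n(t) - f(t)|\, dt - \mu(f) \Bigr\} + o_P(1) \\
&\rightsquigarrow N(0, \sigma^2),
\end{aligned}$$

since $\mu(\tilde f_n) = \mu(f) + o_P(n^{-1/6})$ by Lemma 3.1 and the restriction that $\alpha > 1/6$.
Combining this with (3.1), we obtain the desired conclusion. □

4. Discussion and open questions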



We note that Abrevaya and Huang [1], in their Theorem 3, provide a general result 
on the unconditional limiting distribution of the bootstrap for argmax estimators 
that implies (2.5) and includes many other monotone function settings. Their result, 
in combination with our arguments and our Theorem 2.2, could thus probably be 
used to deduce bootstrap inconsistency for monotone function estimators in general. 

Note also that under the conditions of Lemma 3.1,

$$\sup_{t \in [0,1]} |\tilde f_n(t) - f(t)| = o_P(n^{-1/3}).$$



This means that the assumptions on the smoothness of / used in Groeneboom et al. 
[12] are so strong that we can construct a kernel density estimator that uniformly 
converges faster than the Grenander estimator. Thus, with these assumptions, the 
Grenander estimator is clearly not optimal. This raises the important question 
about whether the assumptions in Groeneboom et al. [12] can be relaxed to the point 
that there are no kernel density estimators superior to the Grenander estimator. 
Alternatively, is it possible to show that the assumptions cannot be relaxed? If this 
is the case, then the Grenander estimator is generally not optimal for $L_1$ confidence band
construction. 

Perhaps a more pressing open problem is constructing valid uniform confidence
bands for the Grenander estimator. It appears as if establishing the uniform rate,
which seems to be $n^{1/3} (\log n)^{-1/3}$, is not too hard in comparison to establishing
distributional convergence. As with the arguments used in Groeneboom et al. [12]
for the $L_1$ error, it appears as if the uniform error should converge to some extremum
of the process $|\xi(t)|$ defined in Section 3 over some increasing interval $[0, r_n]$. If this
could be done, then the extremal limiting distribution results in Hooghiemstra and
Lopuhaä [13] may be applicable, yielding an extreme value distribution in the limit
after standardization. Establishing this, however, seems to be very difficult without
results for convergence of empirical processes over noncompact index sets. A further
question is whether this can be accomplished without imposing assumptions so
strong that the primacy of the Grenander estimator is lost (as seems to have happened in the
$L_1$ error case). This issue of lost primacy, of course, does not arise in the pointwise
error setting (Birgé [4]).

Finally, we note that these results and issues for the Grenander estimator have 
implications for the survival estimator under current status censoring as well as for 
monotone function estimation in general because of the similar argmax structures 
noted previously.